# Data Science

Fin-R1
Fin-R1 is a large language model built for the financial field and aimed at stronger financial reasoning. Jointly developed by Shanghai University of Finance and Economics and Caiyue Xingchen, it is based on Qwen2.5-7B-Instruct and trained with fine-tuning and reinforcement learning. It reasons efficiently about financial problems and suits core financial scenarios such as banking and securities. The model is free and open source, so users can adopt and improve it.
Finance
64.0K

Pruna
Pruna is a model optimization framework designed for developers. It applies compression techniques such as quantization, pruning, and compilation to make machine learning models faster, smaller, and cheaper to run at inference time. It works with various model types, including LLMs and vision transformers, and supports Linux, macOS, and Windows. Pruna also offers an enterprise version, Pruna Pro, which unlocks more advanced optimization features and priority support.
Development & Tools
81.4K
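
As a rough illustration of the quantization idea mentioned in the Pruna entry, the sketch below applies symmetric per-tensor int8 quantization to a short list of weights. It is a generic textbook technique written in plain Python, not Pruna's API; the function names `quantize_int8` and `dequantize` and the sample weights are assumptions made for this example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in quantized]


# Hypothetical sample weights, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for storing one byte per weight instead of four.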

AI Data Science Team
AI Data Science Team is an AI-driven tool that helps users complete data science tasks faster. It automates and accelerates data science workflows through a set of specialized agents that handle tasks such as data cleaning, feature engineering, and modeling. Its primary advantage is greater efficiency with less manual intervention, which suits enterprises and research institutions that need to process and analyze large amounts of data quickly. The product is currently in beta, under active development, and may undergo significant changes. It is released under the MIT License and is free to use and contribute to on GitHub.
Data Analysis
54.9K

Vectrix Graphs
vectrix-graphs is a powerful graphical library focusing on the visualization of multi-model embeddings. It supports a variety of machine learning models and data types, presenting complex data structures in an intuitive graphical format. The main advantage of this library lies in its flexibility and extensibility, making it easy to integrate into existing data science workflows. Developed by the vectrix-ai team, this library aims to aid researchers and developers in better understanding and analyzing model embedding results. As an open-source project, it is available for free on GitHub, suitable for projects and teams of all sizes.
Data Analysis
48.3K
English Picks

Zasper
Zasper is an integrated development environment (IDE) specially designed for data science, built from the ground up to support large-scale concurrent processing. It features minimal memory usage, exceptional speed, and the ability to handle numerous concurrent connections. Zasper is highly suitable for running REPL-style data applications like Jupyter Notebook. Its main advantages include efficient concurrent processing and lightweight resource consumption, making it of significant value in the data science field. Currently, Zasper is available as an open-source version, ideal for data scientists and developers.
Development & Tools
61.5K
English Picks

ChatGPT Pro
ChatGPT Pro is a subscription product launched by OpenAI at a monthly fee of $200, providing scalable access to OpenAI's most advanced models and tools. The plan includes unlimited access to the OpenAI o1 model, as well as o1-mini, GPT-4o, and advanced voice capabilities. The o1 pro mode is a version of o1 that employs more computational resources to enable deeper thought and provide better answers, particularly when addressing the most challenging problems. ChatGPT Pro is designed to enhance productivity for researchers, engineers, and other individuals who utilize research-level intelligence in their daily work while remaining at the cutting edge of AI advances.
Chatbot
63.8K
English Picks

GenCast
GenCast is a high-resolution (0.25°) AI ensemble model developed by Google DeepMind. It predicts daily weather and extreme weather events more accurately than the ENS system of the European Centre for Medium-Range Weather Forecasts (ECMWF), and it produces faster forecasts up to 15 days in advance. GenCast is a diffusion model, the kind of generative AI model behind recent rapid progress in image, video, and music generation. It learns global weather patterns from historical weather data and generates complex probability distributions over future weather scenarios. The model's code, weights, and prediction results will be publicly released to support the wider weather forecasting community.
AI Model
72.0K

Pyramid Analytics
Pyramid Analytics is a business decision intelligence platform that integrates data preparation, business analytics, and data science, helping enterprises achieve fast and effective decision-making. The platform simplifies and guides the use of data in the decision-making process through AI-enhanced analytics, automation, and collaborative insights, lowering skill barriers and accelerating smarter decision integration across organizations. With its intuitive user experience and no-code platform, Pyramid Analytics provides a self-service experience for business users while offering robust management and governance capabilities for enterprises, ensuring that everyone can make decisions without compromise.
Business Intelligence
52.4K

Marimo
Marimo is an open-source reactive Python notebook that emphasizes reproducibility, is git-friendly, can be executed as scripts, and can be shared as applications. It automates the execution of affected cells in response to changes, removing the cumbersome task of managing notebook states. Marimo's UI elements, such as data frame GUIs and charts, make data processing swift, futuristic, and intuitive. Marimo notebooks are stored as .py files, compatible with git version control, executable as Python scripts, importable into other notebooks or Python files, and can be linted or formatted using your preferred tools—all within a modern AI-supported editor.
Notebooks
52.4K
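
The reactive behavior described in the Marimo entry (re-running the cells affected by a change) can be pictured with a small dependency-graph sketch. This is a toy model, not marimo's implementation or API: the `CELLS` table, the cell names, and the `run_order` helper are all hypothetical, and each cell simply declares which variables it reads and writes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical notebook: every cell lists the variables it reads and writes.
CELLS = {
    "a": {"reads": set(), "writes": {"x"}},
    "b": {"reads": {"x"}, "writes": {"y"}},
    "c": {"reads": {"y"}, "writes": {"z"}},
    "d": {"reads": set(), "writes": {"w"}},
}


def run_order(cells, changed):
    """Return the cells to re-run after `changed` is edited, dependencies first."""
    # Which cell defines each variable.
    writer = {var: name for name, cell in cells.items() for var in cell["writes"]}
    # For each cell, the set of cells it depends on.
    deps = {
        name: {writer[var] for var in cell["reads"] if var in writer}
        for name, cell in cells.items()
    }
    # Grow the affected set until no cell depends on a newly affected cell.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for name, parents in deps.items():
            if name not in affected and parents & affected:
                affected.add(name)
                grew = True
    # Keep only affected cells, in an order that respects dependencies.
    return [name for name in TopologicalSorter(deps).static_order() if name in affected]
```

In this toy notebook, editing cell `a` schedules `a`, `b`, and `c` in that order, while the unrelated cell `d` is left alone.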

Graphusion
Graphusion is a pipeline tool designed for extracting knowledge graph triples from text. It builds knowledge graphs through a series of steps, including concept extraction, candidate triple extraction, and triple fusion. This tool is significant as it automates the extraction of structured information from large volumes of text data, supporting knowledge management and data science projects. The main advantages of Graphusion include its automation capabilities, adaptability to different datasets, and flexible configuration options. Developed by tdurieux, the related code and documentation can be found on GitHub. Currently, the tool is free, but the pricing strategy may change based on developer updates and maintenance.
Research Equipment
57.4K

DataChain
DataChain is a contemporary Python data frame library tailored for artificial intelligence. It is designed to organize unstructured data into datasets and process data at scale on local machines. DataChain does not abstract or hide AI models and API calls but facilitates their integration into modern data stacks. The product's main advantages include its efficiency, ease of use, and powerful data processing capabilities, supporting a variety of data storage and processing methods, including images, videos, text, and more, while seamlessly interfacing with deep learning frameworks like PyTorch and TensorFlow. DataChain is open-source and follows the Apache-2.0 license, making it freely available for users.
Development & Tools
54.9K

Awesome LLM Resources
awesome-LLM-resources is a platform that aggregates global resources for large language models (LLMs), offering a range of tools and resources from data acquisition and fine-tuning to inference, evaluation, and real-world applications. Its significance lies in providing researchers and developers with a comprehensive resource library to facilitate the efficient development and optimization of their language models. Maintained by Wang Rongsheng, the platform is continuously updated, providing robust support for the advancement of the LLM field.
AI tools website directory
61.5K
English Picks

DeepLearning.AI
DeepLearning.AI is an online education platform founded by renowned AI expert Andrew Ng, focused on delivering high-quality courses and professional certificates in machine learning and deep learning. The platform offers beginners and professionals practical opportunities to learn AI skills and apply them. By collaborating with industry leaders, DeepLearning.AI ensures that course content is cutting-edge and practical, helping learners build a solid foundation in AI and advance their careers.
Education
56.3K

xAI
xAI is a company dedicated to building artificial intelligence that accelerates human scientific discovery. It is led by Elon Musk, CEO of Tesla and SpaceX. According to the company, its team members have contributed some of the most widely used methods in the field, including the Adam optimizer, batch normalization, layer normalization, and the discovery of adversarial examples. They have also introduced techniques and analyses such as Transformer-XL, Autoformalization, memory transformers, batch size scaling, μTransfer, and SimCLR, and have participated in or led major developments including AlphaStar, AlphaCode, Inception, Minerva, GPT-3.5, and GPT-4. The team is advised by Dan Hendrycks, director of the Center for AI Safety. xAI works closely with X to bring its technology to the more than 500 million users of the X app.
Research Equipment
63.8K

Data-Juicer
Data-Juicer is a comprehensive multimodal data processing system aimed at delivering higher quality, richer, and more digestible data for large language models (LLMs). It offers a systematic and reusable data processing library, supports collaborative development between data and models, allows rapid iteration through a sandbox lab, and provides features like data and model feedback loops, visualization, and multidimensional automated evaluation, helping users better understand and improve their data and models. Data-Juicer is actively maintained and regularly enhanced with more features, data recipes, and datasets.
AI Data Mining
66.2K

UbiOps
UbiOps is an AI infrastructure platform that helps teams run their AI and machine learning workloads as reliable and secure microservices without changing their existing workflows. It provides fast pipelines that need no DevOps work, optimized compute resources, support for LLMs and CV models, and more. UbiOps supports hybrid and multi-cloud workload orchestration, so models can be deployed in private or public cloud environments while data and models stay in the user's environment. UbiOps also provides built-in security features such as end-to-end encryption, secure data storage, and access control, helping businesses comply with relevant regulations.
Development Platform
50.8K

LAMDA-TALENT
LAMDA-TALENT is a comprehensive tabular data analysis toolbox and benchmarking platform that integrates over 20 deep learning methods, 10 traditional methods, and 300+ diverse tabular datasets. This toolbox aims to enhance model performance on tabular data, offers robust preprocessing capabilities, optimizes data learning, and supports user-friendly and adaptable operations suitable for both novice and expert data scientists.
AI Data Mining
50.8K
English Picks

FiddleCube
FiddleCube is a product focused on the field of data science. It can quickly generate question-answer pairs from user data to help users evaluate large language models (LLMs). It provides accurate gold-standard datasets, supports various question types, and enables evaluation of data accuracy through metrics. Moreover, FiddleCube offers diagnostic tools to help users identify and improve underperforming queries.
Research Equipment
53.0K

AI Online Course
AI Online Course is an interactive learning platform that provides clear and concise introductions to artificial intelligence, making complex concepts easy to understand. It covers topics such as machine learning, deep learning, computer vision, self-driving cars, and chatbots, emphasizing practical applications and technological advancements.
Education
68.2K

Next AI Jobs
Next AI Jobs is a website that provides job and career opportunities in artificial intelligence, machine learning, natural language processing, and data science. It connects employers and job seekers in the AI industry, offering a wide range of development and career advancement opportunities. The main advantage of Next AI Jobs is its concentration of AI-related jobs and career options, providing job seekers with a more convenient path for professional development.
Job seeking
54.6K

Dreamseer
Dreamseer is an app that utilizes data science to interpret dreams, helping users gain a deeper understanding of themselves and achieve personal growth and evolution. Its core strengths include providing profound insights, fostering community collaboration, and expanding the realm of dream exploration. Dreamseer operates within the spheres of personal development and community engagement.
Personal Assistance
46.4K

MyGO
MyGO is a tool for multimodal knowledge graph completion. It processes discrete modal information as fine-grained labels to enhance completion accuracy. MyGO utilizes the transformers library to embed text labels and trains and evaluates on multimodal datasets. It supports custom datasets and provides training scripts for replicating experimental results.
AI Data Mining
70.1K
Fresh Picks

Machine Learning Engineer Learning Path
Google Cloud's Machine Learning Engineer Learning Path is a curated set of online courses and labs designed to equip learners with practical hands-on experience in Google Cloud technologies. It covers key skills in designing, building, deploying, optimizing, running, and maintaining machine learning systems. Upon completion of this learning path, learners can further pursue the Google Cloud Machine Learning Engineer certification, laying a solid foundation for career advancement.
Education
58.5K

DataCamp
DataCamp is an online learning platform offering courses in data science, AI, and related fields. It provides a hands-on learning experience through interactive exercises and short videos, covering a wide range of topics including Python, R, SQL, ChatGPT, and Power BI. DataCamp also offers certifications and resources for data science career development.
AI course
87.8K

Semantic Space Theory
Semantic Space Theory (SST) is the foundation of Hume AI research. It employs computational and data-driven methods to map the full spectrum of human emotions. Through natural data and advanced statistical methods, SST treats emotions as high-dimensional semantic spaces, revealing the complexity and subtle nuances of emotions.
AI Model
64.9K

Cleora.ai
Cleora PRO is a tool that empowers data science teams to create high-quality customer and product embedding vectors without expensive hardware. It represents entities (such as customers, products, stores, accounts, etc.) through embedding vectors, similar to Word2Vec or BERT for text, or CLIP for images. Cleora's embedding vectors are behavioral, representing entities based on their behavioral history, which exists in the form of a large graph. With Cleora PRO, you can build business models like recommendation systems, customer segmentation, propensity prediction, lifecycle value modeling, and churn prediction.
Data Analysis
47.5K

FibonacciKu
FibonacciKu is a personalized AI learning assistant designed for both teachers and students. It simplifies the learning process, making education more convenient.
Education
50.5K

MLJAR
MLJAR offers data science tools and learning materials that help users understand and use their data. Its features include automated machine learning, turning notebooks into interactive web applications, generating Python charts with LLMs, building custom SaaS applications, and server and website monitoring. MLJAR's strengths are explainable AI (XAI), model interpretability, fairness metrics, and fast anomaly detection with timely notifications. It offers several pricing plans and compares algorithms such as decision trees, random forests, XGBoost, LightGBM, and CatBoost. It is positioned as a data science tool.
Development & Tools
53.0K

OpenDoc AI
OpenDoc AI is a tool that empowers everyone with data science capabilities, accelerating analytics, custom AI model building, and workflow automation tenfold. It automates data workflows using generative AI, provides clear AI instructions for company-wide use, transforms data into actionable insights without requiring training or data science expertise, and seamlessly connects to databases and processes various data types. OpenDoc AI is trusted and supported by teams of all sizes, bringing collaborative knowledge experiences to organizations across industries.
Data Analysis
57.1K

Onehouse
Onehouse is a universal data lakehouse offering open storage, continuous data streams, and automatic optimization across table formats, engines, and cloud platforms. It is an automated data platform built on Hudi, Delta, and Iceberg that supports business intelligence, data science, and AI/ML in a unified lakehouse. Supporting both streaming and batch processing, Onehouse automatically manages data infrastructure with openness and interoperability, saving costs and scaling to meet evolving needs. Created by the developers of Apache Hudi, Onehouse offers high-throughput streaming ingestion, easy change data capture, automated data management, cloud-native tables, and metadata.
Data Analysis
50.0K

# Featured AI Tools
English Picks

Jules AI
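Jules is an asynchronous coding agent that automatically handles tedious coding tasks so you can spend your time on core coding work. Its main strength is its GitHub integration: it automates pull requests (PRs), runs tests, and validates code on cloud virtual machines, significantly improving development efficiency. Jules suits a wide range of developers and is especially useful for busy teams that need help managing projects and code quality.
Development & Programming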
48.9K

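NoCode
NoCode is a platform that requires no programming experience: users describe their ideas in natural language and quickly generate applications. This lowers the barrier to development and lets more people realize their own ideas. The platform offers real-time preview and one-click deployment, making it easy to use even for people without technical knowledge.
Development Platform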
44.7K

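ListenHub
ListenHub is a lightweight AI podcast generator that supports Chinese and English. Using state-of-the-art AI technology, it quickly generates podcast content on topics that interest the user. Its main advantages include natural-sounding conversation and very high audio quality, so users can enjoy a high-quality listening experience anytime, anywhere. Besides speeding up content generation, ListenHub supports mobile devices and is easy to use in a variety of settings. It is positioned as an efficient tool for acquiring information and serves the needs of a broad range of listeners.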
AI
42.8K
Chinese Picks

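Tencent Hunyuan Image 2.0
Tencent Hunyuan Image 2.0 is Tencent's newly released AI image generation model, with significantly improved generation speed and image quality. It uses an encoder-decoder with an ultra-high compression ratio and a new diffusion architecture, bringing image generation time down to the millisecond level and avoiding the long waits of conventional generation. It also combines reinforcement learning algorithms with human aesthetic knowledge to improve the realism and detail of its images, making it suitable for professional users such as designers and creators.
Image Generation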
43.3K

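OpenMemory MCP
OpenMemory is an open-source personal memory layer that provides private, portable memory management for large language models (LLMs). Users keep full control of their data and can stay secure while building AI applications. The project supports Docker, Python, and Node.js, making it suitable for developers building personalized AI experiences. It is also recommended for users who want to use AI without exposing personal information.
Open Source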
45.3K

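FastVLM
FastVLM is an efficient vision encoding model designed for vision-language models. By using the novel FastViTHD hybrid vision encoder, it reduces both the encoding time for high-resolution images and the number of output tokens, improving model throughput and accuracy. FastVLM is positioned to give developers strong vision-language processing capabilities, and it performs especially well on mobile devices that need fast responses.
Image Processing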
42.8K
English Picks

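Pika
Pika is a video production platform where users upload their creative ideas and AI automatically generates videos based on them. Its main features are video generation from a wide variety of ideas, professional video effects, and simple, easy-to-use operation. It offers a free trial and targets creators and video enthusiasts.
Video Production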
17.6M
Chinese Picks

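LiblibAI
LiblibAI is a leading AI creation platform in China. It offers powerful AI creation capabilities that support creators' creativity. The platform provides a vast number of free AI creation models, which users can search for and use to create images, text, audio, and more. It also supports users training their own AI models. As a platform for a broad base of creators, it aims to provide equal opportunities to create and to contribute to the creative industry so that everyone can enjoy the pleasure of creating.
AI Model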
6.9M